<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="utf-8">
    <title>
        CV Alex Rogozhnikov
    </title>
    <style>
        body {
            font-size: 14px;
            font-family: "Helvetica Neue", Helvetica, Arial, "Liberation Sans", Roboto, Noto, sans-serif;
            /*font-family: sans-serif;*/
        }

        body {
            max-width: 800px;
            margin: auto;
        }

        .demonstrations img {
            float: left;
            width: 130px;
            height: 130px;
            margin: 5px;
            margin-right: 10px;
            border-radius: 75px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.6);
        }

        .demonstrations {
            display: flex;
            flex-wrap: wrap;
            justify-content: space-between;
        }

        .demonstrations>div {
            width: 45%;
        }

        .pagebreak {
            page-break-before: always;
        }

        summary {
            font-size: 1.2em;
            color: #667;
            text-decoration: underline;
            margin: 5px 0px;
        }

        h2 {
            color: #038;
        }
    </style>
</head>

<body>
    <!-- <div style="width: 110px; height: 110px; border-radius: 100px; box-shadow: 0px 0px 5px 0 #000; float: left;
            margin: 25px; display: inline-block;
              background-position: center center;
              background-size: cover;
              background-image: url('images/alex_rogozhnikov.jpeg');">
    </div> -->
    <!--<img src='images/alex_rogozhnikov.jpeg' style='height: 400px; width: 400px;' />-->
    <h1>Alex Rogozhnikov</h1>

    <!--russian: Алексей Михайлович Рогожников-->
    <p>
        Email: <a href='mailto:alex.rogozhnikov@yandex.ru'>alex.rogozhnikov@yandex.ru</a>
        <br />
        Blog: <a href='https://arogozhnikov.github.io'>Brilliantly wrong</a>
        <br />
        Github: <a href='https://github.com/arogozhnikov'>arogozhnikov</a>
        <br />
        Google Scholar: <a href='https://scholar.google.com/citations?user=lKrlBjwAAAAJ&hl=en'>Alex Rogozhnikov</a>
        <br />
        <!-- ResearchGate: <a href='https://www.researchgate.net/profile/Alex_Rogozhnikov'>Alex_Rogozhnikov</a>
        <br /> -->
    </p>


    <h2>Education and Career</h2>
    <p>
        2019-now  &mdash; principal ML scientist at Herophilus (previously named System1 Biosciences) <br />
        2017-2018 &mdash; leading (research) engineer at Intelligence Lab, Samsung Research <br />
        2014-2017 &mdash; research scientist at Yandex (the leading search engine in Russia), where I worked on joint research projects with
        <a href='https://home.cern/'>CERN</a>
        <br />
        <br />
        2015 &mdash; Ph.D. in computer science from Moscow State University <br />
        2014 &mdash; M.Sc. in machine learning from Yandex School of Data Analysis <br />
        2014 &mdash; M.Sc. in mathematical physics from Higher School of Economics (diploma with honors) <br />
        2012 &mdash; M.Sc. in computer science from Moscow State University (diploma with honors) <br />
    </p>
    <p>
        Previously a member of CERN experiments: LHCb, SHiP, and an associated member of the OPERA experiment at INFN.
        <!-- <br /> Winner of national olympiads in physics and mathematics. -->
    </p>

    <h2>Research</h2>

    <p>
        I develop algorithmic (and typically ML-based) solutions for the most important and intriguing problems.
    </p>
    <p>
        At Herophilus (a biotech startup growing "brains-in-a-dish" called cerebral organoids)
        I built data pipelines and analysed TL imaging, single-cell sequencing, IF organoid slices, ICC images, video imaging of neural activity, and gene expression profiles. <br />
        Reworked the data analysis team's tooling, developed the company's models and approaches for vision, and introduced an efficient and reliable algorithm to learn genotypes during demultiplexing in single-cell sequencing. <br />
    </p>
    <p>
        At Intelligence Lab (Samsung Research) I investigated various approaches to one-shot learning
        for face recognition on mobile devices.
        <br />
        Led development of neural text-to-speech synthesis and developed a data-efficient soft alignment of speech and text.
        <br />
        Also worked on object detection, active learning, and model conversion for production.
    </p>

    <p>
        At Yandex/CERN I developed ML approaches to various problems in high energy physics at the Large Hadron Collider:
        particle identification, tracking, online data filtering, flavour tagging, and particle shower detection.
    </p>
    <p>
        My methods were primarily developed for the LHCb experiment, but later spread to other CERN experiments.
    </p>

    <p>
        Previous research topics include optimal control, mathematical physics, and even solid state theory.
    </p>
    <p>
        A deep technical background helps me design efficient algorithms and approaches,
        while broad experience helps with the non-trivial aspects of algorithms' existence in the wild.
        And critical thinking is crucial for building proper validation.
    </p>


    <h2>Software development</h2>

    <p>
        Author and maintainer of <a href='https://github.com/arogozhnikov/einops'>einops</a>, a package for tensor manipulations in deep learning.
        Einops is used in leading AI labs, including FAIR, Google AI, Microsoft Research, and DeepMind.
        Einops supports all major frameworks, including pytorch, tensorflow, jax, and mxnet.
    </p>
    <p>
        Wrote "<a href='https://github.com/arogozhnikov/python3_with_pleasure'>migrating to python 3 with pleasure</a>" back in 2018,
        which turned out to be rather impactful. In those days, 'whether to migrate' was still a debated question.
    </p>

    <p>
        Authored and maintained multiple open-source packages:
    </p>
    <ul>
        <li><a href='https://github.com/arogozhnikov/hep_ml'>hep_ml</a> (machine learning algorithms for high energy physics, widely used at CERN)</li>
        <li><a href='https://github.com/yandex/rep'>yandex/REP</a> (an environment for reproducible research)</li>
        <li><a href='https://github.com/arogozhnikov/infiniteboost'>infiniteboost</a> (ML algorithm that doesn't overfit)</li>
        <li><a href='https://github.com/herophilus/demuxalot'>herophilus/demuxalot</a> (highly optimized donor detection from sequencing data)</li>
    </ul>

    <p>
        Dozens of other smaller things, public and proprietary:
        research versioning, highly optimized number crunchers, ML competitions platform,
        and miniaturized deep learning models for cell phones.
    </p>
    <p>
        I have maintained some of my projects, each with multiple users, for over 5 years, which has shaped my views on architecture, methodology, and choice of tooling.
    </p>


    <h2>Selected publications </h2>
    <ul>
        <li>
            Rogozhnikov, A. (2022). Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation. (ICLR 2022, oral)
        </li>
        <li>
            Rogozhnikov, A., Ramkumar, P., Bedi, R., Kato, S., Escola, G.S. (2022)
            Hierarchical confounder discovery in the experiment-machine learning cycle, Cell Patterns
        </li>

        <li>
            Likhomanenko, T., Xu, Q., Collobert, R., Synnaeve, G., & Rogozhnikov, A. (2021). CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings. (NeurIPS 2021)
        </li>

        <!-- <li>
            Rogozhnikov, A., Likhomanenko. T., "InfiniteBoost: building infinite ensembles with gradient descent." arXiv:1706.01109 (2017).
        </li> -->
        <li>
            Likhomanenko, T., Derkach, D., & Rogozhnikov, A. (2016). Inclusive Flavour Tagging Algorithm. In Journal of Physics: Conference
            Series
        </li>
        <li>
            Rogozhnikov, A. (2016). Reweighting with boosted decision trees. In Journal of Physics: Conference Series
        </li>
        <li>
            Rogozhnikov, A., Bukva, A., Gligorov, V., Ustyuzhanin, A., & Williams, M. (2015). New approaches for boosting to uniformity.
            Journal of Instrumentation
        </li>
        <li>
            Likhomanenko, T., Ilten, P., Khairullin, E., Rogozhnikov, A., Ustyuzhanin, A., & Williams, M. (2015). LHCb Topological Trigger
            Reoptimization. In Journal of Physics: Conference Series
        </li>
        <li>
            Likhomanenko, T., Rogozhnikov, A., Baranov, A., Khairullin, E., & Ustyuzhanin, A. (2015). Reproducible Experiment Platform.
            In Journal of Physics: Conference Series
        </li>
        <li>
            Alcaraz, F. C., Brankov, J. G., Priezzhev, V. B., Rittenberg, V., & Rogozhnikov, A. M. (2014). Noncontractible loops in the
            dense O (n) loop model on the cylinder. Physical Review E
        </li>
        <li>
            Rogozhnikov, A. M. (2013). Optimal control of longitudinal vibrations of composite rods with the same wave propagation time
            in each part. Differential Equations
        </li>
        <li>
            Rogozhnikov, A. M. (2012). Study of a mixed problem describing the oscillations of a rod consisting of several segments with
            arbitrary lengths. Doklady Mathematics
        </li>
    </ul>
    <p>
        Full list at <a href='https://scholar.google.com/citations?user=lKrlBjwAAAAJ&hl=en&oi=ao'>Google Scholar</a>.
    </p>
    <h2>Conferences</h2>
    <p>
        Talks at conferences and workshops:
        ACAT 2016 (Chile),
        Heavy Flavour Data Mining Workshop 2016 (Zurich),
        International Workshop on Nuclear Emulsions 2016 (Naples),
        IML Machine Learning workshop 2017 (Geneva).

        Co-authors also presented results at
        CHEP 2015 (Okinawa, Japan), CHEP 2016 (San Francisco, USA),
        ICML 2015 (Lille, France),
        ML prospects and applications 2015 (Berlin),
        ACAT 2017 (Seattle, USA),
        Connecting the dots 2018 (Seattle, USA),
        CSHL 2019 (NY, USA).
    </p>


    <div class='pagebreak'></div>



    <h2>Software Stack</h2>
    <p>
        Main tools: python + scientific stack + pytorch + AWS. I'm deeply familiar with the Python world, and its data stack in particular.
        <br />
        I use the HTML/CSS/JS stack from time to time, and enjoy WebGL and shaders.
        <br />
        Used in various projects: C# and the .NET platform; I also have experience with C++, PHP, SQL, Fortran, and a bit of Java.
        <br />
        Used in education, but not in real projects: MATLAB and assembly; certified CUDA developer.
    </p>

    <!-- <p>
        In 2012, my team got <strong>1st place</strong> in international competition "Accelerate
        Your Code" by Intel, providing the fastest parallel DNA processing system in C++ with openmp.
    </p> -->



    <h2>Teaching</h2>
    <!-- <p>
        Materials of the courses are available on github, corresponding links are provided:
        Links are github repositories with courses materials
    </p>  -->
    <p>
        I've authored several ML courses with quite different scopes.
        Course materials are available online:
    </p>
    <ul>
        <li>
            Machine Learning in Science and Industry &mdash; invited lecturer
            (<a href='https://github.com/yandexdataschool/MLAtGradDays'>2017, Heidelberg, Germany</a>)
        </li>
        <li>
            Machine Learning in High Energy Physics school
            (<a href='https://github.com/yandexdataschool/mlhep2015'>2015: St. Petersburg, Russia</a>;
            <a href='https://github.com/yandexdataschool/mlhep2016'>2016: Lund, Sweden</a>) &mdash; lecturer
        </li>
        <li>
            Machine learning at Imperial College, London
            (<a href='https://github.com/arogozhnikov/YSDA_ICL'>2015</a>,
            <a href='https://github.com/yandexdataschool/MLatImperial2016'>2016</a>,
            <a href='https://github.com/yandexdataschool/MLatImperial2017'>2017</a>) &mdash; practical classes
        </li>
    </ul>

    <p>
        Organized and co-organized around a dozen in-class data challenges based on the Kaggle platform.
        Developed and implemented specific evaluation
        metrics for the "<a href='https://www.kaggle.com/c/flavours-of-physics'>Flavours of Physics</a>" challenge at
        Kaggle run by Yandex &amp; CERN.
    </p>
    <p>
        To make the algorithmic side of machine learning techniques more intuitive,
        I've made a number of (quite popular) demonstrations, which can be found in
        <a href='https://arogozhnikov.github.io'>my blog</a>.
    </p>




    <!--
    <h3>Blog (Brillianly wrong)</h3>
    <p>
        In 2016 I've started posting visual explanations for some topics of machine learning I find interesting.
        Currently my blog has around 75000 visitors a year.
    </p> -->
    <!--
    <div class='demonstrations'>
        <div>
            <a href="https://arogozhnikov.github.io/2016/12/19/markov_chain_monte_carlo">
                <img src="images/mini_hmc_explained.png" alt="">
                    Hamiltonian Monte Carlo explained
                </a>
            <p>
                Explanation of basic MCMC concepts followed by details about Hamiltonian MC
            </p>
        </div>
        <div>
            <a href="https://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained">
                <img src="images/mini_gb_explained.png" alt="">
                    Gradient Boosting explained
                </a>
            <p>
                Interactive explanation of gradient boosting for regression
            </p>
        </div>
        <div>
            <a href="https://arogozhnikov.github.io/2016/07/05/gradient_boosting_playground">
                <img src="images/mini_gb_playground.png" alt="">
                Gradient boosting playground
                </a>
            <p>
                The right place to understand important knobs of gradient boosting
            </p>
        </div>
        <div>
            <a href="https://arogozhnikov.github.io/3d_nn">
                <img src="images/mini_3d_nn.png" alt="">
                Neural Networks visualized in 3d
                </a>
            <p>
                Testing a fancy way of visualizing simple neural network with shaders in WebGL
            </p>
        </div>
    </div>
    <p>
        I use these (and other, more simple and less detailed) demonstrations in teaching.
    </p>
    -->


    <br />
    <br />

    <h2>Detailed research projects </h2>
    <p>
        (click for details)
    </p>

    <details>
        <summary>CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings</summary>
        <ul>
            <li>
                Transformers' attention mechanism is permutation-invariant (making "cat eats fish" and "fish eats cat" identical to a transformer).
            </li>
            <li>
                Positional embedding is a common way to provide the missing information about order.
                We propose a simple yet extremely efficient and flexible strategy: augmenting encoded positions during training.
            </li>
            <li>
                We experimented with computer vision, speech recognition, and machine translation to confirm better generalization and wide applicability of our method.
                Along the way, we found that our method by design handles different resolutions and STFT sampling frequencies well.
            </li>
            <li>
                <a href="https://openreview.net/pdf?id=n-FqqWXnWW">Read paper</a>
            </li>
        </ul>
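        <p>
            The augmentation can be sketched in a few lines of NumPy (an illustrative sketch only; parameter names here are hypothetical, and the paper should be consulted for the exact recipe): positions are mean-centered, then randomly shifted globally, jittered locally, and rescaled, so the model cannot rely on absolute positions.
        </p>

```python
import numpy as np

def augment_positions(pos, rng, max_global_shift=5.0, max_local_shift=0.5, max_scale=1.4):
    """CAPE-style augmentation of a 1D position sequence during training (toy sketch)."""
    pos = pos - pos.mean()                                                       # mean-center
    pos = pos + rng.uniform(-max_global_shift, max_global_shift)                 # global shift
    pos = pos + rng.uniform(-max_local_shift, max_local_shift, size=pos.shape)   # local jitter
    pos = pos * np.exp(rng.uniform(-np.log(max_scale), np.log(max_scale)))       # global scale
    return pos
```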
    </details>
    <details>
        <summary>Demultiplexer for single-cell RNA-sequencing</summary>
        <ul>
            <li>
                <a href="https://en.wikipedia.org/wiki/Single_cell_sequencing">Single-cell transcriptome sequencing</a>
                is a technique that employs next-generation sequencing to estimate the RNA content of individual cells.
            </li>
            <li>
                Multiplexing is a technique widely used to compare RNA profiles of different tissues (e.g. diseased vs healthy) within the same pipeline.
                Cells from different tissues (e.g. organoids) are pooled and then processed together.
            </li>
            <li>
                Demultiplexing is the inverse operation, done computationally: by looking at the resulting reads, we try to guess which donor each cell came from.
                Fortunately, everyone's genotype is unique (ok, almost everyone's), and these variations can be exploited for demultiplexing.
            </li>
            <li>
                My project goes further and learns genotype-specific polymorphisms from the RNA-sequencing data to complement data known from
                <a href="https://www.well.ox.ac.uk/ogc/wp-content/uploads/2017/06/GSA-inputation-design-information.pdf">GSA</a>. <br />
                The resulting algorithm has ~4-5 times lower error than demuxlet, which allowed us to significantly increase the number of donors considered, which was critical in our case.
            </li>
        </ul>
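        <p>
            The core likelihood idea can be sketched in NumPy (a toy illustration, not the demuxalot implementation; the names and the simple per-SNP model are assumptions for clarity): each donor's genotype predicts the fraction of alternative-allele reads at each SNP, and a cell is assigned to the donor whose genotype best explains its observed reads.
        </p>

```python
import numpy as np

def assign_donor(alt_counts, ref_counts, donor_alt_freq, eps=1e-3):
    """Toy demultiplexing: pick the donor whose genotype best explains the
    observed alt/ref read counts of one cell (per-SNP binomial log-likelihood).

    alt_counts, ref_counts: (n_snps,) read counts for this cell
    donor_alt_freq: (n_donors, n_snps) expected alt-allele fraction per donor
    """
    p = np.clip(donor_alt_freq, eps, 1 - eps)  # avoid log(0) for homozygous SNPs
    log_likelihood = alt_counts @ np.log(p).T + ref_counts @ np.log(1 - p).T
    return int(np.argmax(log_likelihood))
```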
    </details>
    <details>
        <summary>One-shot learning for analysis of organoid variability from microscope images</summary>
        <img src="images/BF-similarity.png" width="450" style="padding: 30px"/>
        <p>
            Developed a method to describe organoid morphology from brightfield imaging without manual labelling.
            Demonstrated its power by predicting gene expression levels as well as various visual phenotypes from images.
        </p>
        <p>
            Decomposed variability into contributing factors to guide subsequent decisions on minimizing it.
            The method is so sensitive that it notices minor changes in the protocol.
        </p>
    </details>
    <details>
        <summary>Unsupervised tissue characterization from IHC</summary>
        <p>
            A model was trained to describe tissue structure by providing a vector description for each segment,
            which allowed automated comparison and search for similar tissue.
        </p>
        <p>
            <img src="images/IF-unsupervised.png" width="600" />
        </p>
    </details>

    <details>
        <summary>Einops &mdash; new deep learning operations</summary>
        <p>
            <a href="https://github.com/arogozhnikov/einops">Einops</a> is a package and notation that rethink tensor manipulations.
            Working on top of the leading deep learning frameworks, einops makes code shorter, more readable, and more reliable.
        </p>
        <p>
            Einops is used by dozens of AI labs in hundreds of open-source projects.
        </p>
    </details>

    <details>
        <summary>Text-to-speech for Russian language</summary>
        <p>
            Led development of a speech synthesis technique that employs only neural networks.
            From nothing to adorable chatter in 2 months.
        </p>
    </details>

    <details>
        <summary>Few-shot learning for face recognition</summary>
        <p>
            Built a system for face identification/verification based on deep learning.
            The system is optimized to run on mobile devices and to handle the few-shot learning scenario
            (i.e. very few training samples per person).
        </p>
    </details>

    <details>
        <summary>InfiniteBoost: building infinite ensembles with gradient descent (with T.Likhomanenko)</summary>
        <p>
            InfiniteBoost is a modification of gradient boosting that converges as the number of trees in the ensemble tends to infinity.
            This approach also makes it possible to automatically tune capacity (a parameter similar
            to the learning rate in gradient boosting).
            <a href='https://arxiv.org/abs/1706.01109'>Read more</a>
        </p>
    </details>

    <details>
        <summary>Particle identification at the LHCb (with D.Derkach, M.Hushchyn, T.Likhomanenko)</summary>
        <p>
            LHCb is one of the four major experiments at the LHC, and it is a bit different from the others &mdash; LHCb is a single-arm
            detector and analyzes particles within a quite limited angle. The advantage of this scheme, compared to other experiments,
            is that LHCb collects more information to identify particles, which makes it more precise in studies of b-physics.
        </p>
        <p>
            We prepared a major update of the particle identification system with deep networks and GBDTs. An important part was preparing models
            that are independent of momentum, using an approach from "Boosting to uniformity" (see below).
            <a href='https://indico.cern.ch/event/589985/contributions/2430716/attachments/1398000/2131978/PID_presentation.pdf'>Read more</a>
        </p>
    </details>



    <details>
        <summary>Finding electromagnetic showers in the OPERA </summary>
        <p>
            OPERA is an experiment placed inside a mountain in Italy; it was created to confirm neutrino oscillations (Nobel Prize
            2015). In this project I created a system that detects electromagnetic showers in the data collected by
            OPERA: among millions of base tracks, it finds a small pattern of ~a hundred base tracks.
            A needle-in-a-haystack problem with tough computational requirements.
            <a href='https://arogozhnikov.github.io/2017/06/24/opera.html'>Read more</a>
        </p>
    </details>

    <details>
        <summary>Reweighting with Boosted Decision Trees</summary>
        <p>
            An important problem in many high energy physics analyses is the discrepancy between simulated and real data. The approach
            used previously to reduce this effect could only handle discrepancies in 1-2 variables.
        </p>
        <p>
            An algorithm was proposed that directly solves the reweighting problem in many dimensions and additionally addresses some issues
            important for LHCb analyses, such as handling negative weights (so-called sWeights). This tool is used in LHCb
            analyses.
            <a href='http://iopscience.iop.org/article/10.1088/1742-6596/762/1/012036/pdf'>Read more </a>
        </p>
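        <p>
            For intuition, the classic one-dimensional baseline is just a histogram ratio; a NumPy sketch (illustrative only; the BDT-based method generalizes this idea to many dimensions):
        </p>

```python
import numpy as np

def histogram_weights(source, target, bins=20):
    """1D histogram reweighting: weight source events by target/source bin ratios
    so that the weighted source histogram matches the target histogram."""
    edges = np.histogram_bin_edges(np.concatenate([source, target]), bins=bins)
    source_counts, _ = np.histogram(source, bins=edges)
    target_counts, _ = np.histogram(target, bins=edges)
    ratio = np.where(source_counts > 0, target_counts / np.maximum(source_counts, 1), 1.0)
    bin_index = np.clip(np.digitize(source, edges) - 1, 0, len(ratio) - 1)
    return ratio[bin_index]
```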
    </details>

    <!-- <div class='pagebreak'></div> -->

    <details>
        <summary>Inclusive flavour tagging at the LHCb (with D.Derkach, T.Likhomanenko)</summary>
        <p>
            Guessing the flavour of a neutral (uncharged) meson isn't easy, but it is required to estimate some of the standard model parameters.
            This information can be partially reconstructed by analyzing the tracks left by other particles produced in the collision.
        </p>
        <p>
            We came up with a simple probabilistic model that combines information from all the other tracks &mdash; and it works better
            than previous approaches, where a separate analysis was performed for each type of tagging particle and for each
            meson.
            <a href='http://iopscience.iop.org/article/10.1088/1742-6596/762/1/012045/pdf'>Read more</a>
        </p>
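        <p>
            The combination step can be illustrated with a naive-Bayes-style sketch in NumPy (a toy version for intuition, not the model from the paper): each track contributes its probability that the meson had a given flavour, and the per-track odds are multiplied.
        </p>

```python
import numpy as np

def combine_tracks(p_same_flavour):
    """Combine per-track tagging probabilities into one meson-level probability
    by summing log-odds (equivalent to multiplying odds, naive-Bayes style)."""
    p = np.asarray(p_same_flavour, dtype=float)
    log_odds = np.sum(np.log(p) - np.log(1.0 - p))
    return 1.0 / (1.0 + np.exp(-log_odds))
```

        <p>
            A few weak 60% tags combine into a much more confident prediction, while uninformative 50% tracks leave the answer unchanged.
        </p>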
        <p>
            Later I tried to improve the system by including an attention-like mechanism. This helps when the amount of training data is
            limited.
            <a href='https://indico.cern.ch/event/595059/contributions/2497373/attachments/1431033/2198204/Rogozhnikov_InclusiveFlavourTagging.pdf'> Read more</a>
        </p>
    </details>


    <details>
        <summary>Boosting to uniformity (main author, with A. Bukva, V. Gligorov, A. Ustyuzhanin, and M. Williams)</summary>
        <p>
            Various statistical dependencies may be easy to exploit to improve classification, yet undesirable to let influence our decisions
            (simplified examples are gender and race when using ML in hiring). Simply removing these features from training
            may not be sufficient, since other features may still carry this information (e.g. the photo in a CV makes it easy to guess
            the gender; in some languages, the gender of a writer can be inferred from the text).
        </p>
        <p>
            We developed a method capable of suppressing the dependency between the classification result and one or more selected variables
            using a specific loss (based on the Cramér-von Mises criterion).
            <!-- Method is targeted at applications in high energy physics. -->
            <a href='http://iopscience.iop.org/article/10.1088/1748-0221/10/03/T03002/pdf'>Read more</a>
        </p>
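        <p>
            The uniformity measure behind the loss can be sketched in NumPy (a simplified, non-differentiable version for intuition; the paper builds a trainable loss from a similar Cramér-von Mises quantity): within each bin of the protected variable, compare the distribution of predictions against the global one.
        </p>

```python
import numpy as np

def cvm_flatness(predictions, protected, n_groups=4):
    """Toy Cramér-von Mises flatness: average squared difference between the
    prediction CDF inside each protected-variable bin and the global CDF.
    Near 0 means predictions are distributed identically across all bins."""
    n = len(predictions)
    global_cdf = np.empty(n)
    global_cdf[np.argsort(predictions)] = np.linspace(0.0, 1.0, n)
    edges = np.quantile(protected, np.linspace(0.0, 1.0, n_groups + 1))
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (protected >= lo) & (protected <= hi)
        m = mask.sum()
        # empirical CDF of predictions within this bin, via double argsort (ranks)
        local_cdf = np.argsort(np.argsort(predictions[mask])) / max(m - 1, 1)
        total += np.mean((local_cdf - global_cdf[mask]) ** 2)
    return total / n_groups
```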
    </details>

    <details>
        <summary>Tracking in the COMET (with E. Gillies)</summary>
        <p>
            COMET is a high energy physics experiment, currently under construction in Japan, targeted at finding charged lepton flavour violating (LFV) transitions.
            The goal was to prepare a fast system that efficiently selects candidate events for such transitions.
        </p>
        <p>
            Using machine learning coupled with a soft modification of the Hough transform, we were able to improve wire-level recognition
            quality: ROC AUC from 0.95 to 0.9993.

            <a href='https://indico.shef.ac.uk/indico/event/1/session/4/contribution/45/material/slides/0.pdf'>
                Read more
            </a>
        </p>
    </details>

    <details>
        <summary>Inclusive trigger for the LHCb (contributing author)</summary>
        <p>
            Millions of collisions must be analyzed each second at the LHCb experiment, which is an enormous amount of data (that can't
            even be stored), so the experiment uses online triggers that decide which collisions to store and which to
            delete.
        </p>
        <p>
            Our team developed a new trigger system based on MatrixNet (Yandex's proprietary GBDT modification). I was responsible for speeding
            up the model and managed to compress an already trained MatrixNet ensemble from 10,000 trees to 100 without a significant
            drop in quality.
            <a href='http://iopscience.iop.org/article/10.1088/1742-6596/664/8/082025/meta'>Read more</a>
        </p>
    </details>



    <details>
        <summary>Optimal boundary control of oscillations in distributed systems (PhD thesis)</summary>
        <p>
            I was in a group led by Vladimir Il'in (<a href='https://ru.wikipedia.org/wiki/Ильин,_Владимир_Александрович'>wiki</a>)
            and investigated problems of optimal boundary control of oscillations described by the wave equation (exciting
            / damping particular oscillations by actively interacting with the system at its boundary). Typical approaches
            study numerical algorithms that find an approximate solution; our group developed methods to solve the problem
            analytically, hence precisely and more efficiently.
        </p>
        <p>
            I introduced a special notation based on operator matrices to describe the problem and provided the optimal solution of the control
            problem for composite rods/strings of multiple parts (previous results covered only a very specific case with two
            parts and additional strong requirements).
        </p>
        <p>
            This research was selected as <strong>"best student's paper in mathematics"</strong> by the Russian Academy of Sciences
            in 2012.
        </p>
    </details>


    <details>
        <summary>Computing properties of dense loop model using duality with spanning web model </summary>
        <p>
            This is research in solid state theory: both the dense loop model and the spanning web model are lattice models (with corresponding
            partition functions); their nice duality made it possible to compute the partition function and loop density of the
            dense loop model.
            <a href='https://arxiv.org/pdf/1409.7848.pdf'>Read more</a> (my part is the computations for web models).
        </p>
    </details>
    <br />
    <br />
    <br />
    <br />
</body>

</html>
